# Replication Package

This directory contains the complete replication workflow for "Spatial Correlation, Trade, and Inequality: Evidence from the Global Climate" by Jonathan I. Dingel and Kyle C. Meng, in the *Review of Economics and Statistics* (REStat).

Replication package authored by Kyle C. Meng.

## Directory Structure

```
prep_data/                      # Data preparation scripts
processeddata/                  # Data processing
calibration_dataprep/           # Data preparation for calibration tasks
configure_julia/                # Julia package environment setup
calibration_2013_2099/          # Climate projection simulations
analysis_statistics/            # Statistical analysis and output generation
calibration/                    # Bilateral productivity swap simulations (Figures A4, E1)
EK_nested_frechet/              # Eaton-Kortum nested Frechet simulations (Table A3)
circular_geography_exhibits/    # Circular geography exhibits (Tables A1-A2, Figures 2, A2, A3, A5)
commonfunctions/                # Shared Julia functions
modelsolvingfunctions/          # Model-solving Julia functions
toolbox/                        # Shared utilities for Stata and MATLAB
```

## Setup

Before running the replication package, uncompress the file `prep_data/input/rawData.tar` by running the following command in `prep_data/input`:
```bash
tar -xvf rawData.tar
```

## Complete Workflow

To run the entire replication package from start to finish, run the following command from the directory containing this README:

```bash
make -f run.make
```

You can also run `make` in each individual folder, in the following order:
- prep_data
- processeddata
- calibration_dataprep
- configure_julia
- calibration_2013_2099
- analysis_statistics
- calibration
- EK_nested_frechet
- circular_geography_exhibits

Run the tasks in this order, because downstream tasks read outputs produced by upstream tasks.
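
The per-folder sequence above can be scripted. The following is a minimal sketch and not part of the package: the function name `run_tasks` and the `DRY_RUN` variable are hypothetical, and it assumes each task's Makefile sits either in the task folder itself or in its `code` subfolder. Adjust the paths if your checkout is laid out differently.

```bash
#!/usr/bin/env bash
# Sketch: run each task's Makefile in dependency order, stopping at the first failure.
# Assumption: the Makefile lives in <task>/code or in <task>.

TASKS="prep_data processeddata calibration_dataprep configure_julia
calibration_2013_2099 analysis_statistics calibration EK_nested_frechet
circular_geography_exhibits"

run_tasks() {
  root="${1:-.}"
  for task in $TASKS; do
    if [ -f "$root/$task/code/Makefile" ]; then
      dir="$root/$task/code"
    else
      dir="$root/$task"
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "make -C $dir"          # preview only, nothing is executed
    else
      make -C "$dir" || return 1   # stop so downstream tasks never run on stale inputs
    fi
  done
}
```

After sourcing the script, `DRY_RUN=1 run_tasks .` prints the nine `make -C` commands without executing them, and `run_tasks .` runs them in order from the directory containing this README.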

## Data Sources

The scripts process data from the following sources:

### Geography
- **Country Crosswalks** (prep_countrycrosswalks.do) - ISO3 country codes and mapping between different coding systems
- **GIS Data** (prep_gis.do) - Geographic coordinates for countries (excludes Antarctica)

### Climate & Weather
- **Weather Data** (prep_weather.do) - Temperature and precipitation data by crop area
- **CMIP5 Projections** (prep_cmip.do) - Climate model projections
- **ENSO Indices** (prep_enso.do) - Kaplan extended NINO indices (NINO1+2, NINO3, NINO4, NINO3.4) with monthly and annual aggregations

### Economy
- **World Bank WDI** (prep_wdi.do) - World Development Indicators
- **FAO Data**
  - prep_fao_production.do - Agricultural production data by cereal type
  - prep_fao_prices.do - Producer prices for cereals
  - prep_fao_trade.do - Country-level import/export agricultural trade flows
  - prep_fao_storage.do - Cereal storage data (YOY net drawdown)
- **Comtrade Data**
  - prep_comtrade_raw.do - Raw Comtrade extracts
  - prep_comtrade_ToT_food.do - Terms of trade for food products
  - prep_comtrade_ToT_cereal.do - Terms of trade for cereals
- **WITS Tariffs** (prep_tariffs.do) - Bilateral tariff data from WITS
- **TRAINS** (prep_trains.do) - Non-tariff measures (NTMs)
- **Agricultural Distortions** (prep_agdistortions.do) - Agricultural trade policy distortions
- **Agricultural Tariffs** (prep_agtariffs.do) - Agricultural-specific tariff data
- **Oil Prices** (prep_oilprice.do) - Global oil price index

## Exhibits

The following are the scripts that generate tables and figures in the paper:

### analysis_statistics/code
- Tables 1, 2, B1, C1, F1-F11: `table*.do`
- Figures 1a, 1b, 3b, 4a-b, 5: `figure*.do`
- Figures B1a-b, E3, E4, E6a-b, E7a-b, E8, E9, E10, E11-E14: `figureE*.do`
- Figures E2, E5: `figureE2.m`, `figureE5.m` (MATLAB)

### calibration/code
- Figures A4, E1: `bilateralswaps_analysis_moreargs.do`
- Simulations: `bilateralswaps_moreargs.jl`

### EK_nested_frechet/code
- Table A3: `table_EKNF_beta1.do`
- Simulations: `EK_nested_frechet_welfare_analysis.jl`

### circular_geography_exhibits/code
- Tables A1, A2: `sinewave_calls.jl`
- Figures 2a, 2b, A2a, A2b, A5: `sinewave_calls.R`
- Figure A3: `generalcase_figure_masterpaper.R`

## Task Runtimes

For most tasks, we report approximate runtimes on a 2025 MacBook Air with an Apple M4 chip and 16GB RAM.
For the `calibration` task, we report the runtime of jobs run on Columbia University's Shared Research Computing Facility,
which has Intel Xeon Platinum 8460Y 2 GHz processors.
The following table lists the computation time for each task:

| Task | Runtime |
|------|---------|
| prep_data | approx. 30 min |
| processeddata | < 10 seconds |
| calibration_dataprep | < 10 seconds |
| configure_julia | < 10 seconds |
| calibration_2013_2099 | approx. 1 hour |
| analysis_statistics | approx. 2 hours 20 min |
| calibration | approx. 15 hours 25 min (HPC cluster: 16 CPUs, 400GB RAM) |
| EK_nested_frechet | approx. 1 hour 33 min |
| circular_geography_exhibits | approx. 1 min 20 sec |

## Software Requirements

### Required Software

- **Stata** (version 16 or higher recommended)
  - Required packages are included in `toolbox/STATA`

- **R** (version 4.1.0)
  - Required for generating figures in `circular_geography_exhibits`
  - Scripts will automatically install any missing R packages.

- **Julia** (version 0.7)
  - Install this old version of Julia using `juliaup` so that `julia +0.7` is a valid command on your machine
  - Required Julia packages are defined via `Project.toml` and `Manifest.toml` in `configure_julia`
  - Install required packages by running `make` in `configure_julia/code`

- **MATLAB**
  - Required functions are included in `toolbox/MATLAB/`
  - Required to run `figureE2.m` and `figureE5.m`
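
Before starting a long run, it can help to confirm that the required programs are reachable from the shell. The sketch below is illustrative only: `check_tools` is a hypothetical helper, and the executable names in the usage note (in particular the Stata binary, whose name differs across editions and platforms) are assumptions to adapt to your machine.

```bash
#!/usr/bin/env bash
# Sketch: report which of the named executables are missing from PATH.
# Returns 0 if all are found, 1 otherwise.

check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_tools make tar Rscript julia matlab stata-mp` reports each program on its own line. Running `julia +0.7 --version` afterwards confirms that the Julia 0.7 channel is available through `juliaup`.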


### Toolbox Dependencies

The `toolbox/` directory contains shared utilities:
- `toolbox/shell_functions.make` - Shell functions for Makefiles
- `toolbox/STATA/filepaths.do` - File path configuration
- `toolbox/MATLAB/` - MATLAB utility functions

## Notes

- The workflow uses GNU Make to manage dependencies between scripts
- Individual `.do` files can be run independently if their input dependencies are satisfied
- Stamp files (`.stamp`) are used to track when scripts that produce multiple outputs have completed
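
To see which multi-output scripts have already completed, you can list the stamp files under a task folder. This is a hedged sketch: `list_stamps` is a hypothetical helper, and it assumes only that stamp files end in `.stamp`, not any particular location within the task folders.

```bash
# Sketch: print every stamp file under a directory (default: current directory), sorted by path.
list_stamps() {
  find "${1:-.}" -type f -name '*.stamp' | sort
}
```

For example, `list_stamps analysis_statistics` lists the stamps recorded for that task. If a stamp file is the Makefile target for a script (an assumption about how the Makefiles are written), deleting it causes Make to rerun that script on the next `make`.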
